7 research outputs found

    Machine learning approach for flood risks prediction

    Flooding is one of the main natural disasters worldwide and is caused by natural forces. It has caused vast destruction of property and livestock, and even loss of life. An accurate and efficient flood risk prediction model that can serve as an early warning system is therefore essential. This study develops a predictive model, following the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology, using a Bayesian network (BN) and other machine learning (ML) techniques, namely Decision Tree (DT), k-Nearest Neighbours (kNN) and Support Vector Machine (SVM), for flood risk prediction in Kuala Krai, Kelantan, Malaysia. The data cover the five-year period from 2012 to 2016 and comprise 1,827 observations. The models were compared in terms of accuracy, precision, recall and f-measure. The results showed that DT with the SMOTE method performed best, achieving 99.92% accuracy. SMOTE was also found to be highly effective in dealing with the imbalanced dataset. It is hoped that the findings of this research may assist government and non-government organizations in taking preventive action against floods, which commonly occur in Malaysia due to its wet climate.

    Bayesian approach to classification of football match outcome

    Football match outcome prediction has gained popularity in recent years. It attracts many kinds of followers, from expert analysts to football team managers and others, who try to predict the result before a match starts. Three types of approaches have been proposed to predict a win, loss or draw and to evaluate the attributes of a football team: the statistical approach, the machine learning approach and the Bayesian approach. This paper proposes Bayesian approaches within machine learning, namely Naive Bayes (NB), Tree Augmented Naive Bayes (TAN) and General Bayesian Network (K2), to predict football match outcomes. The data used are the English Premier League match results for three seasons (2014–2015, 2015–2016 and 2016–2017), downloaded from http://www.football-data.co.uk. The experimental results showed that TAN achieved the highest predictive accuracy among the Bayesian approaches, averaging 90.0% across the three seasons, ahead of K2 and NB. It is hoped that these results can be used in future research on predicting football match outcomes.

    An enhanced Bayesian Network prediction model for football matches based on player performance

    In sports analytics, existing research has shown that the Bayesian network (BN) approach predicts football match results with considerably higher accuracy than other classical statistical and machine learning approaches. However, existing prediction models rely solely on historical team features, including match statistics and team statistics, together with historical measures of team achievement such as FIFA ranking, league ranking and total points gained at the end of a season. No known work to date has analysed individual player performance data as part of the parameters used to predict football match results. To address this gap, this research proposes a BN model for match prediction based on player performance data, called the Player Performance (PP) model. To validate the proposed PP model, three existing prediction models were re-implemented and their prediction accuracy measured: the General Individual (GI) model, the Match Statistical (MS) model and the Team Statistical (TS) model. All BN models were constructed using Tree Augmented Naive Bayes (TAN) for structural learning. The dataset covers Arsenal Football Club in the English Premier League (EPL) for the 2014-2015 and 2015-2016 seasons. Apart from the proposed individual player performance data, the dataset includes individual player ratings, the absence or presence of players in a match, match statistics and team statistics. The PP model was then re-constructed using other machine learning techniques, namely k-Nearest Neighbour (kNN) and Decision Tree (DT), to compare their prediction accuracy with that of BN. The experimental results were twofold: the proposed PP model using BN achieved an overall average predictive accuracy of 63.76%, higher than the GI, MS and TS models, and it also outperformed the PP model using kNN and DT by 1.64% and 6.02% respectively.

    Prediction of player position for talent identification in association netball: a regression-based approach

    Among the challenges of Industrial Revolution 4.0 is managing organizations’ talent, especially ensuring that the right person is selected for each position. This study introduces a predictive approach for talent identification in the sport of netball using individual player qualities in terms of physical fitness, mental capacity and technical skills. A data mining approach is proposed using three algorithms: Decision Tree (DT), Neural Network (NN) and Linear Regression (LR). The models are compared based on the Relative Absolute Error (RAE), Mean Absolute Error (MAE), Relative Square Error (RSE), Root Mean Square Error (RMSE) and Coefficient of Determination (R2). The findings are presented and discussed in light of early talent spotting and selection. Generally, LR has the best performance in terms of MAE and RMSE, as it has the lowest values among the three models.
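
The error measures named in this abstract can be illustrated with a minimal Python sketch. It uses the common textbook definitions (RAE and RSE are normalised by the error of always predicting the mean); the abstract does not state which definitions or software the authors used, so the function name `regression_metrics` and the sample values are purely hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, RAE, RSE and R2 for two equal-length sequences.

    Assumption: standard definitions, not necessarily those of the paper.
    """
    n = len(y_true)
    mean_y = sum(y_true) / n

    # Errors of the model's predictions.
    abs_err = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

    # Errors of the naive baseline that always predicts the mean.
    abs_dev = sum(abs(t - mean_y) for t in y_true)
    sq_dev = sum((t - mean_y) ** 2 for t in y_true)

    rse = sq_err / sq_dev
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / abs_dev,
        "RSE": rse,
        "R2": 1.0 - rse,
    }


# Hypothetical values, for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(metrics)
```

Lower MAE, RMSE, RAE and RSE indicate a better fit, while a higher R2 does; under these definitions R2 equals one minus RSE.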

    A regression approach for prediction of Youtube views

    YouTube has grown to be the number one video streaming platform on the Internet and is home to millions of content creators around the globe. Predicting the potential number of YouTube views is extremely important for helping content creators understand what type of videos the audience prefers to watch. In this paper, we introduce two types of regression models for predicting the total number of views a YouTube video can get, based on the statistics available to us. The dataset we use was released to the public by YouTube. The accuracy of both models is then compared by evaluating the mean absolute error and relative absolute error obtained in our experiment. The results showed that the Ordinary Least Squares method provides more accurate output than the Online Gradient Descent method, because the algorithm allows us to find a gradient that is as close as possible to the dependent variable, although its predictive performance is only above average.

    Comparative study of football team rating system using elo rating and pi-rating for Switzerland Super League

    A sports rating system analyses the results of sports competitions to provide ratings for each team or player. In a football match, the audience will usually predict which team will win based on the goals scored by half-time or on penalties. This prediction matters because, when evaluating match results, it is important to first compare the potential strength of the teams involved. The main goal of this research is therefore to compare the performance of team rating systems using Elo Rating and Pi Rating when forecasting match outcomes in association football. The well-known Elo Rating system is used to calculate team ratings, whereas Pi Rating is used to predict football match results based on a team’s ability to win when playing at home or away. The two techniques generate forecasts differently, and both types of models can be used to generate pre-game forecasts. The Elo ratings worked better when predicting matches from a large data set. The Pi Rating system applies to any other sport where the score is considered a good indicator for prediction purposes, as well as for determining the relative performance of adversaries. This study focuses on football match data from the Switzerland Super League. The dataset, taken from the Football-Data.co.uk website, comprises around 1,421 Switzerland Super League matches. The research finds that a classification model based on the Decision Forest classifier is effective, with an f-measure of 68% for Pi Rating and 73% for Elo Rating. Therefore, Elo Rating is the better of the two team rating systems for predicting football match outcomes.
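
For readers unfamiliar with Elo ratings, the following Python sketch shows the standard Elo update for a single match. It is an illustration under stated assumptions, not the configuration used in the study: the abstract gives no parameters, so the K-factor of 20, the starting rating of 1500 and the names `expected_score` and `update_elo` are all hypothetical. The Pi Rating system is not sketched here.

```python
def expected_score(rating_a, rating_b):
    """Expected score of team A against team B under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(home, away, home_result, k=20.0):
    """Return the updated (home, away) ratings after one match.

    home_result is 1.0 for a home win, 0.5 for a draw, 0.0 for a home loss.
    k=20 is an assumed K-factor, not a value taken from the paper.
    """
    exp_home = expected_score(home, away)
    delta = k * (home_result - exp_home)
    # Elo is zero-sum: what one team gains, the other loses.
    return home + delta, away - delta


# Hypothetical example: two equally rated teams, home side wins.
new_home, new_away = update_elo(1500.0, 1500.0, home_result=1.0)
print(new_home, new_away)
```

A pre-game forecast can be read directly from `expected_score`, which lies between 0 and 1 and grows with the rating gap in favour of the first team.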